Day 14: Module Expansion and a Streamlit Demo Program

2025 iThome Ironman Contest

DAY 14

Generative AI

From RAG to Agentic RAG: Building a Local Intelligent Retrieval System in 30 Days, Part 14 of the series

17th Ironman Contest

seedfood

Team: 躺平的內捲小隊

2025-09-28 23:38:23

Introduction

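After yesterday's module overview, you should have a clearer picture of how a RAG program is put together. To separate the code more cleanly, today we break the module structure down further and include a Streamlit demo program.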

🏗️ Expanded Module Structure

RAG_base/
│
├── indexer/               # Document upload and indexing
│   ├── config.py          # Settings
│   ├── requirements       # Required packages
│   ├── ocr_loader.py      # Extracts text and tables, chunks them, and saves the text
│   └── indexer.py         # Embeds chunks and writes them to the vector DB (Qdrant/ChromaDB)
│
├── retriever/             # Retrieval and Q&A
│   ├── config.py          # Settings
│   ├── requirements       # Required packages
│   └── retriever.py       # Retrieval + rerank + prompt construction
│
├── model/                 # Model management
│   ├── __init__.py
│   ├── embeddings.py      # Embedding model
│   ├── reranker.py        # Rerank model
│   └── llm.py             # LLM (OpenAI or a local model)
│
├── frontend/              # Streamlit frontend
│   ├── __init__.py
│   └── app.py             # Main program: document upload page + Q&A page
│
├── data/                  # Raw documents and processed data
│   ├── raw/               # Original uploaded documents
│   └── processed/         # Extracted and chunked data
│
└── chromadb/              # ChromaDB local storage
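
To make two of the steps named in the tree more concrete, here is a minimal sketch of the chunking done in ocr_loader.py and the prompt construction done in retriever.py. The function names `chunk_text` and `build_prompt`, the character-window strategy, the default sizes, and the prompt wording are all assumptions for illustration. The series' actual modules may do this differently.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def build_prompt(question: str, contexts: list[str]) -> str:
    """Assemble retrieved chunks and the question into one prompt (assumed wording)."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval. Numbering the contexts makes it easier to show which passage an answer drew on.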

📊 Streamlit Frontend

import streamlit as st
from indexer.indexer import process_file
from retriever.retriever import answer_question

# Page settings
st.set_page_config(page_title="RAG Demo", layout="wide")

# Sidebar menu
page = st.sidebar.selectbox("Select a feature", ["📂 Document Upload", "💬 Q&A"])

if page == "📂 Document Upload":
    st.header("Document Upload and Indexing")
    uploaded_file = st.file_uploader("Choose a document", type=["pdf", "docx", "txt"])

    if uploaded_file is not None:
        with st.spinner("Processing document..."):
            # Call process_file from the indexer module
            result = process_file(uploaded_file)
        st.success(f"✅ Document processed and indexed: {uploaded_file.name}")
        st.json(result)  # Show the chunk results or metadata

elif page == "💬 Q&A":
    st.header("RAG Q&A System")
    query = st.text_input("Enter your question")

    if st.button("Submit"):
        if not query.strip():
            # Skip retrieval when the input is empty
            st.warning("Please enter a question first.")
        else:
            with st.spinner("Retrieving and generating an answer..."):
                # Call answer_question from the retriever module
                answer, refs = answer_question(query)
            st.subheader("Answer")
            st.write(answer)

            with st.expander("📖 References"):
                for ref in refs:
                    st.markdown(f"- {ref}")